JIT, interpreter, AOT preinit, and CoreLib: Saturate float/double conversions to small integral types#128604
JIT, interpreter, AOT preinit, and CoreLib: Saturate float/double conversions to small integral types#128604Copilot wants to merge 21 commits into
Conversation
Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
…ll casts Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
…ing casts Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
| // Both operands and result must be of the same floating point type. | ||
| GenTree* op1 = tree->gtGetOp1(); | ||
| GenTree* op2 = tree->gtGetOp2IfPresent(); | ||
| assert(varTypeIsFloating(op1)); | ||
| assert(op1->TypeGet() == tree->TypeGet()); |
RISC-V64 and LoongArch64 fmin/fmax implement IEEE 754-2019 "minimumNumber"/"maximumNumber" semantics which replace NaN with the other operand (rather than propagating it). Using MaxNative/MinNative in the float domain would map NaN inputs to smallMin instead of 0. Fix by moving RISC-V64 and LoongArch64 to the same integer-domain saturation approach as ARM32: do R->int first (NaN->0 via the saturating cast), then apply NI_PRIMITIVE_SaturateToInt8/16/UInt8/16 intrinsics that lower to branch-based clamp sequences using bge/mov/instGen_Set_Reg_To_Imm. - morph.cpp: Move RISC-V64/LA64 from float-domain (#if FEATURE_HW_INTRINSICS||WASM) to integer-domain (#elif TARGET_ARM||RISCV64||LOONGARCH64) block - lsrariscv64.cpp: Add saturation intrinsics to GT_INTRINSIC handler; allocate temp register - codegenriscv64.cpp: Implement branch-based clamp in genIntrinsic for saturation intrinsics - lsraloongarch64.cpp: Add saturation intrinsics to GT_INTRINSIC handler; allocate temp register - codegenloongarch64.cpp: Implement branch-based clamp in genIntrinsic for saturation intrinsics Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
|
@copilot, some builds are failing with asserts similar to: This is because the new intrinsics aren't being handled under all the applicable paths. You should look at something like |
Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
Done. Audited the paths used by
Verified by building |
| case INS_ssat: | ||
| // imm1 = shift amount (must be 0 for no shift), imm2 = saturation bits N (1-32) | ||
| // Encoding: sat_imm field = N-1 stored in bits[4:0]; no shift (sh=0, imm5=0). | ||
| assert(insDoesNotSetFlags(flags)); | ||
| assert((imm1 == 0) && (imm2 >= 1) && (imm2 <= 32)); // required for encoding | ||
| imm = (lsb << 5) | (width - 1); // lsb=shift=0, width=N -> sat_imm = N-1 |
There was a problem hiding this comment.
Added assert(reg1 != REG_PC) / assert(reg2 != REG_PC) to the INS_ssat case to match the other Thumb-2 D0 encodings in this switch in 71abf12.
| case INS_usat: | ||
| // imm1 = shift amount (must be 0 for no shift), imm2 = saturation bits N (0-31) | ||
| // Encoding: sat_imm field = N stored directly in bits[4:0]; no shift (sh=0, imm5=0). | ||
| assert(insDoesNotSetFlags(flags)); | ||
| assert((imm1 == 0) && (imm2 >= 0) && (imm2 <= 31)); // required for encoding | ||
| imm = (lsb << 5) | width; // lsb=shift=0, width=N -> sat_imm = N | ||
|
|
There was a problem hiding this comment.
Added the same REG_PC assertions to the INS_usat case in 71abf12.
| level++; | ||
| break; | ||
| } | ||
| #endif // TARGET_RISCV64 |
There was a problem hiding this comment.
Updated the #endif comment to // TARGET_RISCV64 || TARGET_WASM to match the #if/#elif chain in 71abf12.
| // Copy src to dst (no-op when dst == src). | ||
| emit->emitIns_R_R(INS_mov, EA_PTRSIZE, dst, src); | ||
|
|
There was a problem hiding this comment.
Replaced the EA_PTRSIZE INS_mov with slli.w rd, rs, 0 (EA_4BYTE) so dst is a sign-extended 32-bit value before the full-register bge compares against the clamp bounds in 71abf12.
| // Copy src to dst (no-op when dst == src). | ||
| emit->emitIns_R_R(INS_mov, EA_PTRSIZE, dst, src); | ||
|
|
There was a problem hiding this comment.
Replaced the EA_PTRSIZE INS_mov with INS_sext_w (EA_4BYTE) so the input is normalized to canonical sign-extended form before the full-register bge clamp compares in 71abf12.
…t/usat encoding Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
Unchecked
float/double→ small integral type (sbyte,byte,short,ushort,char) conversions did not saturate at the small-type boundary. For example,(short)(double)32768.000000000007produced-32768instead of32767, diverging from the saturatingR → intcontract introduced in .NET 9.The JIT expands
R → smallTasR → int → smallT. The innerR → intsaturates, but the outerint → smallTonly truncates low bits. The CoreCLR interpreter and the ILC type preinitializer had analogous bugs and were updated for parity with runtime behavior. In addition, CoreLibTryConvertpaths forfloat/doubleto small integral types were updated to use direct casts on CoreCLR (while preserving existing MONO behavior under#if MONO).Summary
Fixes saturating semantics for unchecked
float/double→ small integral type casts across all supported JIT targets (xarch, x86, arm64, RISC-V64, LoongArch64, ARM32, and WASM), the CoreCLR interpreter, and the NativeAOT type preinitializer, and applies corresponding CoreCLR-sideTryConvertsimplifications that rely on the corrected cast semantics.Changes Made
src/coreclr/jit/morph.cpp—fgMorphExpandCast: Added target-specific handling soR -> smallsaturates at destination bounds across supported architectures:FEATURE_HW_INTRINSICStargets use float-domain min/max clamping before existing cast expansion.GT_INTRINSIC.R -> int32followed byNI_PRIMITIVE_SaturateTo{Int8,Int16,UInt8,UInt16}), preserving NaN→0 semantics from the saturatingR → intcast.NI_PRIMITIVE_SaturateTo{Int8,Int16,UInt8,UInt16}intrinsics and the WASM binaryGT_INTRINSIC(MaxNative/MinNative)produced by morph:gentree.cpp: added the fourSaturateTo*cases to the unaryGT_INTRINSICcost switch ingtSetEvalOrder, added aTARGET_WASMblock to the binaryGT_INTRINSICcost switch forMaxNative/MinNative, and added dump labels for the fourSaturateTo*IDs (fixes the'Unknown binary GT_INTRINSIC operator'assert seen on e.g.System.Half:op_Explicit(System.Half):byte).importercalls.cppIsTargetIntrinsic: added the fourSaturateTo*IDs to the ARM, RISCV64, and LoongArch64 switches sorationalize.cpp'sIsTargetIntrinsicassertion is satisfied.valuenum.cppfgValueNumberIntrinsic: added an explicit branch for theSaturateTo*IDs that gives the node an opaque unary VN, backed by newVNF_SaturateToInt8/Int16/UInt8/UInt16entries invaluenumfuncs.h, so VN no longer hits theassert(NI_System_Object_GetType)for these intrinsics.assertionprop.cppIntegralRange::ForNode: taught range analysis the exact result range for eachSaturateTo*intrinsic ([ByteMin, ByteMax],[ShortMin, ShortMax],[0, UByteMax],[0, UShortMax]).SaturateTo*: newINS_ssat/INS_usatThumb-2 instructions ininstrsarm.h, encoding/disassembly support inemitarm.cpp, LSRA/codegen wiring inlsraarm.cppandcodegenarmarch.cpp.SaturateTo*: branch-based clamp incodegenriscv64.cpp/codegenloongarch64.cpp, with LSRA support inlsrariscv64.cpp/lsraloongarch64.cpp.src/coreclr/vm/interpexec.cpp—ConvFpHelper: clamps tonumeric_limits<TResult>(the destination type) instead ofTIntermediate, so floating-point → small-integral conversions now saturate at the destination type's range.src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/TypePreinit.cs: float-domainconv_i1/conv_i2/conv_u1/conv_u2cases now explicitly saturate viaMath.Clamp(NaN → 0), so AOT-baked statics match runtime behavior regardless of which host JIT runs ILC. Original simple-cast version is kept commented out with a TODO to restore once the fix has propagated through the toolchain.TryConvertsimplifications (Double.cs,Single.cs,SByte.cs,Int16.cs):float/double→ small-integralTryConvertTo*/TryConvertFrom*paths now use a direct cast on CoreCLR, with the existing manual clamp preserved under#if MONO.DoubleTests.GenericMath.ConvertToIntegerTest/SingleTests.GenericMath.ConvertToIntegerTest(issue (short)(double)32768.000000000007 produces invalid result #116823 suppressions removed), added a JitBlue regression test for the float/double → small-integral saturation contract, and updated the interpreter test comment/coverage.